Standard Codon Substitution Models Overestimate Purifying Selection for Nonstationary Data

Authors

  • Benjamin D. Kaehler
  • Von Bing Yap
  • Gavin A. Huttley
Abstract

Estimation of natural selection on protein-coding sequences is a key comparative genomics approach for de novo prediction of lineage-specific adaptations. Selective pressure is measured on a per-gene basis by comparing the rate of nonsynonymous substitutions to the rate of synonymous substitutions. All published codon substitution models have been time-reversible and thus assume that sequence composition does not change over time. We previously demonstrated that if time-reversible DNA substitution models are applied in the presence of changing sequence composition, the number of substitutions is systematically biased towards overestimation. We extend these findings to the case of codon substitution models and further demonstrate that the ratio of nonsynonymous to synonymous rates of substitution tends to be underestimated over three data sets of mammals, vertebrates, and insects. Our basis for comparison is a nonstationary codon substitution model that allows sequence composition to change. Goodness-of-fit results demonstrate that our new model tends to fit the data better. Direct measurement of nonstationarity shows that bias in estimates of natural selection and genetic distance increases with the degree of violation of the stationarity assumption. Additionally, inferences drawn under time-reversible models are systematically affected by compositional divergence. As genomic sequences accumulate at an accelerating rate, the importance of accurate de novo estimation of natural selection increases. Our results establish that our new model provides a more robust perspective on this fundamental quantity.
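The distinction between synonymous and nonsynonymous substitutions that underlies this ratio can be illustrated with a minimal Python sketch. It is a naive count of single-nucleotide codon differences between two aligned sequences, not the maximum-likelihood codon substitution models discussed in the abstract, and it makes no correction for multiple hits or for the number of synonymous and nonsynonymous sites. The names `GENETIC_CODE` and `count_single_site_differences` are hypothetical and chosen only for illustration; the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
# Assumption: the standard (nuclear) code applies to the sequences.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_single_site_differences(seq1, seq2):
    """Count synonymous and nonsynonymous codon differences.

    Only codon pairs that differ at exactly one nucleotide are counted.
    Pairs with several differences, stop codons, or unknown codons are
    skipped (a deliberate simplification for illustration).
    Returns a tuple (synonymous, nonsynonymous).
    """
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal length")

    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(seq1), 3):
        codon1 = seq1[i:i + 3].upper()
        codon2 = seq2[i:i + 3].upper()
        n_diff = sum(a != b for a, b in zip(codon1, codon2))
        if n_diff != 1:
            continue  # identical codons or multiple differences
        aa1 = GENETIC_CODE.get(codon1)
        aa2 = GENETIC_CODE.get(codon2)
        if aa1 is None or aa2 is None or "*" in (aa1, aa2):
            continue  # unknown or stop codon
        if aa1 == aa2:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# TTT -> TTC keeps phenylalanine (synonymous);
# CTA -> CCA changes leucine to proline (nonsynonymous).
syn, nonsyn = count_single_site_differences("ATGTTTCTA", "ATGTTCCCA")
print(syn, nonsyn)
```

Raw counts like these are not rates: model-based estimates additionally account for how many synonymous and nonsynonymous changes are possible, for unobserved substitutions, and, in the nonstationary model described above, for changing sequence composition.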


Similar resources

The genetic code can cause systematic bias in simple phylogenetic models.

Phylogenetic analysis depends on inferential methodology estimating accurately the degree of divergence between sequences. Inaccurate estimates can lead to misleading evolutionary inferences, including incorrect tree topology estimates and poor dating of historical species divergence. Protein coding sequences are ubiquitous in phylogenetic inference, but many of the standard methods commonly us...


Pattern of nucleotide substitution and divergence of prophenoloxidase in decapods.

Despite the unprecedented development in identification and characterization of prophenoloxidase (proPO) in commercially important decapods, little is known about the evolutionary relationship, rate of amino acid replacement and differential selection pressures operating on proPO of different species of decapods. Here we report the evolutionary relationship among these nine decapod species base...


Contrasting the efficacy of selection on the X and autosomes in Drosophila.

To investigate the relative efficacy of both positive and purifying natural selection on the X chromosome and the autosomes in Drosophila, we compared rates and patterns of molecular evolution between these chromosome sets using the newly available alignments of orthologous genes from 12 species. Parameters that may influence the relative X versus autosomal substitution rates include the relati...


Contrasting Codon Usage Patterns and Purifying Selection at the Mating Locus in Putatively Asexual Alternaria Fungal Species

Sexual reproduction in heterothallic ascomycete fungi is controlled by a single mating-type locus called MAT1 with two alternate alleles or idiomorphs, MAT1-1 and MAT1-2. These alleles lack sequence similarity and encode different transcriptional regulators. A large number of phytopathogenic fungi including Alternaria spp. are considered asexual, yet still carry expressed MAT1 genes. The molecu...


Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages.

The nonsynonymous (amino acid-altering) to synonymous (silent) substitution rate ratio (omega = d(N)/d(S)) provides a measure of natural selection at the protein level, with omega = 1, <1, and >1 indicating neutral evolution, purifying selection, and positive selection, respectively. Previous studies that used this measure to detect positive selection have often taken an approach of pairwise c...



Journal title:

Volume 9, Issue

Pages -

Publication date: 2017